Resampling methods for parameter-free and robust feature selection with mutual information
Abstract
Combining the mutual information criterion with a forward feature selection strategy offers a good trade-off between optimality of the selected feature subset and computation time. However, it requires setting the parameter(s) of the mutual information estimator and determining when to halt the forward procedure. These two choices are difficult to make because, as the dimensionality of the subset increases, the estimation of the mutual information becomes less and less reliable. This paper proposes to use resampling methods, K-fold cross-validation and the permutation test, to address both issues. The resampling methods provide information about the variance of the estimator, which can then be used to automatically set the parameter and to calculate a threshold to stop the forward procedure. The procedure is illustrated on a synthetic data set as well as on real-world examples. © 2007 Elsevier B.V. All rights reserved.
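The stopping rule described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation. The abstract does not name the mutual information estimator, so a plain histogram (plug-in) estimator with a fixed number of bins stands in here. The function names `mutual_information` and `forward_selection`, the parameters `bins`, `n_perm` and `alpha`, and the choice to permute only the candidate feature are all hypothetical. The K-fold step that sets the estimator parameter is not covered.

```python
import numpy as np


def mutual_information(X, y, bins=4):
    """Plug-in histogram estimate of I(X; y) in nats (stand-in estimator)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)

    def discretize(v):
        # Equal-frequency binning: codes in 0 .. bins-1.
        edges = np.quantile(v, np.linspace(0, 1, bins + 1)[1:-1])
        return np.searchsorted(edges, v, side="right")

    # Encode each row of X as a single integer cell index.
    codes = np.zeros(len(y), dtype=np.int64)
    for j in range(X.shape[1]):
        codes = codes * bins + discretize(X[:, j])
    y_codes = discretize(y)

    # Joint and marginal empirical distributions.
    joint = np.zeros((bins ** X.shape[1], bins))
    np.add.at(joint, (codes, y_codes), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def forward_selection(X, y, n_perm=100, alpha=0.05, bins=4, seed=0):
    """Greedy forward selection, halted by a permutation-test threshold."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    selected, remaining = [], list(range(X.shape[1]))
    while remaining:
        # Score every candidate subset "selected + one more feature".
        scores = [mutual_information(X[:, selected + [j]], y, bins)
                  for j in remaining]
        best = remaining[int(np.argmax(scores))]
        best_mi = max(scores)

        # Null distribution (assumption): shuffle only the candidate column,
        # which breaks its link to y while keeping the subset dimension.
        null = np.empty(n_perm)
        for b in range(n_perm):
            Xp = X[:, selected + [best]].copy()
            Xp[:, -1] = rng.permutation(Xp[:, -1])
            null[b] = mutual_information(Xp, y, bins)

        # Stop when the gain is not above the (1 - alpha) null quantile.
        if best_mi <= np.quantile(null, 1 - alpha):
            break
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(600, 4))
    y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=600)  # only feature 0 matters
    print(forward_selection(X, y, n_perm=30))
```

Because the permuted subset has the same dimensionality as the candidate subset, the null distribution carries the same estimation bias as the score it is compared against. This is one way to address the loss of reliability in higher dimensions that the abstract mentions.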
Related resources
Feature Selection Using Multi Objective Genetic Algorithm with Support Vector Machine
Different approaches have been proposed for feature selection to obtain a suitable feature subset among all features. These methods search the feature space for feature subsets that satisfy some criteria or optimize several objective functions. The objective functions are divided into two main groups: filter and wrapper methods. In filter methods, feature subsets are selected due to some measu...
Mutual information-based feature selection for multilabel classification
This paper introduces a new methodology to perform feature selection in multi-label classification problems. Unlike previous works based on the χ² statistic, the proposed approach uses the multivariate mutual information criterion combined with a problem transformation and a pruning strategy. This allows us to consider the possible dependencies between the class labels and between the features...
The permutation test for feature selection by mutual information
The estimation of mutual information for feature selection is often subject to inaccuracies due to noise, small sample size, bad choice of parameter for the estimator, etc. The choice of a threshold above which a feature will be considered useful is thus difficult to make. Therefore, the use of the permutation test to assess the reliability of the estimation is proposed. The permutation test al...
Advances in Feature Selection with Mutual Information
The selection of features that are relevant for a prediction or classification problem is an important problem in many domains involving high-dimensional data. Selecting features helps to fight the curse of dimensionality, improve the performance of prediction or classification methods, and interpret the application. In a nonlinear context, the mutual information is widely used as relevan...
Classification of Right/Left Hand Motor Imagery by Effective Connectivity Based on Transfer Entropy in EEG Signal
The right and left hand Motor Imagery (MI) analysis based on the electroencephalogram (EEG) signal can directly link the central nervous system to a computer or a device. This study aims to identify a set of robust and nonlinear effective brain connectivity features quantified by transfer entropy (TE) to characterize the relationship between brain regions from EEG signals and create a hierarchi...
Journal title:
- Neurocomputing
Volume 70
Publication date: 2007